Gesture recognition method, gesture recognition system, and performing device thereof

ABSTRACT

A performing device of a gesture recognition system executes a performing procedure of a gesture recognition method. The performing procedure includes steps of: receiving a sensing signal; selecting one of the sensing frames of the sensing signal; determining a soft label of the selected sensing frame; and classifying a gesture event when the soft label of the selected sensing frame is approved. The gesture event is classified to determine the motion of the user. Therefore, the gesture recognition system does not need a predetermined time period to recognize the motion of the user. The time period for recognizing the motion of the user can be dynamic. A total time period for classifying a plurality of motions can be decreased, and the performance of the gesture recognition system can be improved.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a recognition method and a recognition system, and more particularly to a gesture recognition method, a gesture recognition system, and a performing device thereof.

2. Description of the Related Art

Recognition systems generally receive sensing signals from a sensor to recognize the motion of a user. For example, a recognition system receives sensing signals from the sensor, processes the sensing signals, and implements a recognition method to determine whether a user being observed by the sensor is using portions of his or her body to make particular actions or form particular shapes or gestures. The recognition system classifies the motion of the user, and associates the motion of the user with executable commands or instructions.

However, the motion of the user may be a simple motion or a complex motion. When the motion of the user is a simple motion, the recognition system may quickly recognize the motion of the user. When the motion of the user is a complex motion, the recognition system may need more time to recognize the motion of the user. The recognition system cannot determine whether the motion of the user is simple or complex before the recognition system classifies the motion of the user.

Consequently, the recognition system needs a predetermined time period to recognize the motion of the user. The predetermined time period needs to be set according to the complex motion to ensure that the motion of the user can be clearly classified.

When the recognition system needs to classify a plurality of motions, the recognition system may need a total time period that is many times the predetermined time period, which may waste time and negatively influence the performance of the recognition system.

Therefore, the recognition system needs to be further improved.

SUMMARY OF THE INVENTION

An objective of the present invention is to provide a gesture recognition method, a gesture recognition system, and a performing device of the gesture recognition system. The present invention may classify a motion of a user in a dynamic time period that is associated with the motion of the user. Therefore, the performance of the gesture recognition system can be improved.

The gesture recognition method includes a performing procedure;

wherein the performing procedure includes steps of:

- receiving a sensing signal from a sensing unit; wherein the sensing signal comprises a plurality of sensing frames;
- selecting one of the sensing frames;
- determining a soft label of the selected sensing frame according to a neural network; and
- classifying a gesture event when the soft label of the selected sensing frame is approved.

Further, the gesture recognition system includes a performing device and a training device. The performing device includes a sensing unit, a memory unit, and a processing unit.

The sensing unit senses a sensing signal and a training signal. The memory unit stores a neural network.

The processing unit is electrically connected to the sensing unit and the memory unit. The processing unit executes a performing procedure.

The performing procedure includes steps of:

- receiving the sensing signal from the sensing unit; wherein the sensing signal comprises a plurality of sensing frames;
- selecting one of the sensing frames;
- classifying a gesture event of the selected sensing frame according to a neural network;
- determining a soft label of the selected sensing frame according to the neural network;
- classifying the gesture event when the soft label of the selected sensing frame is approved.

The training device is electrically connected to the performing device, and executes a training procedure. The training procedure includes steps of:

- receiving the training signal from the sensing unit of the performing device, wherein the training signal comprises a plurality of training frames;
- determining an amount of the training frames of the training signal;
- determining a function according to the amount of the training frames of the training signal;
- calculating a soft label of each training frame of the training signal according to the function; and
- training the neural network with the soft labels of the training frames of the training signal as ground truth of the neural network.

The gesture recognition system classifies the gesture event to determine the motion of the user as soon as the soft label of the selected sensing frame is approved. Therefore, the gesture recognition system does not need a predetermined time period to recognize the motion of the user. The time period for recognizing the motion of the user can be dynamic, and a total time period for classifying a plurality of motions can be decreased. Namely, the performance of the gesture recognition system can be improved.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flowchart of a performing procedure of a gesture recognition method of the present invention;

FIG. 2 is a flowchart of a training procedure of the gesture recognition method of the present invention;

FIG. 3 is a block diagram of a gesture recognition system of the present invention; and

FIG. 4 is a schematic diagram of the relationship between soft labels and training frames.

DETAILED DESCRIPTION OF THE INVENTION

With reference to FIGS. 1 and 2, the present invention relates to a gesture recognition method, a gesture recognition system, and a performing device of the gesture recognition system. The gesture recognition method includes a performing procedure.

The performing procedure includes steps of:

- receiving a sensing signal from a sensing unit (S101); wherein the sensing signal comprises a plurality of sensing frames;
- selecting one of the sensing frames (S102);
- determining a soft label of the selected sensing frame according to a neural network (S103); and
- classifying a gesture event when the soft label of the selected sensing frame is approved (S104).

Since the gesture event can be classified according to the soft label, the gesture recognition method does not need a predetermined time period to recognize the motion of the user. The time period for recognizing the motion of the user can be dynamic, and a total time period for classifying a plurality of motions can be decreased. Therefore, the performance of the gesture recognition method can be improved.

Moreover, the gesture recognition method further includes a training procedure, and the training procedure includes steps of:

- receiving a training signal (S201); wherein the training signal comprises a plurality of training frames;
- determining an amount of the training frames (S202);
- determining a function according to the amount of the training frames (S203);
- calculating a soft label of each training frame according to the function (S204); and
- training a neural network with the soft labels of the training frames as ground truth of the neural network (S205).

In the embodiment, the soft labels of the training frames are encoded into one-hot vectors to train the neural network.

Moreover, the function is a monotonic function. A first training frame of the training signal is mapped to the soft label of zero through the function, and a last training frame of the training signal is mapped to the soft label of one through the function.
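The conditions placed on the function can be checked mechanically. The following is a minimal sketch under stated assumptions, not the implementation of the invention: the helper names `soft_labels` and `satisfies_label_constraints` are hypothetical, the normalization of frame positions to the range 0 to 1 is an assumption, and the linear and squared ramps are chosen only to illustrate that more than one monotonic function can meet the stated end-point conditions.

```python
from typing import Callable, List


def soft_labels(n_frames: int, fn: Callable[[float], float]) -> List[float]:
    """Apply fn to the normalized position of each training frame (S202 to S204).

    Positions run from 0.0 (first frame) to 1.0 (last frame). This
    normalization is an assumption; the text only fixes the end points.
    """
    if n_frames < 2:
        raise ValueError("need at least two training frames")
    return [fn(i / (n_frames - 1)) for i in range(n_frames)]


def satisfies_label_constraints(labels: List[float], tol: float = 1e-9) -> bool:
    """Return True if labels never decrease, start at 0, and end at 1."""
    monotonic = all(a <= b + tol for a, b in zip(labels, labels[1:]))
    starts_at_zero = abs(labels[0]) <= tol
    ends_at_one = abs(labels[-1] - 1.0) <= tol
    return monotonic and starts_at_zero and ends_at_one


# Two hypothetical functions that meet the conditions, and one that does not.
linear = soft_labels(5, lambda t: t)
squared = soft_labels(5, lambda t: t * t)
reversed_ramp = soft_labels(5, lambda t: 1.0 - t)

print(satisfies_label_constraints(linear))         # True
print(satisfies_label_constraints(squared))        # True
print(satisfies_label_constraints(reversed_ramp))  # False
```

The resulting labels would then serve as the ground truth in step S205; the training of the neural network itself is outside the scope of this sketch.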

With reference to FIG. 3, the gesture recognition system includes a performing device 10 and a training device 20. The performing device 10 includes a sensing unit 101, a memory unit 102, and a processing unit 103.

The sensing unit 101 senses the sensing signal and the training signal. The memory unit 102 stores the neural network. The processing unit 103 is electrically connected to the sensing unit 101 and the memory unit 102. The processing unit 103 executes the performing procedure as described above.

The training device 20 is electrically connected to the performing device 10, and executes the training procedure as described above. In the training procedure, the training signal is received from the sensing unit 101 of the performing device 10.

With reference to FIG. 4, in the embodiment, the function is a monotonic function. A first training frame 11 of the training signal is mapped to the soft label of zero through the function, and a last training frame 1n of the training signal is mapped to the soft label of one through the function.

For example, the training signal includes n frames. The soft label of the first training frame 11 is 0, the soft label of the second training frame 12 is 0.1, the soft label of the third training frame is 0.2, and the soft label of the last training frame, namely the nth training frame 1n, is 1.
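The values in this example are consistent with a linear function over eleven frames. The sketch below reproduces them under two assumptions that the text does not state explicitly: that the function is linear, and that n is 11. The name `linear_soft_label` is hypothetical.

```python
def linear_soft_label(index: int, n_frames: int) -> float:
    """Soft label of the training frame at 1-based position `index`.

    Assumes a linear function: the first frame maps to 0 and the
    last frame maps to 1, with equal steps in between.
    """
    if n_frames < 2 or not 1 <= index <= n_frames:
        raise ValueError("index must lie within 1..n_frames, n_frames >= 2")
    return (index - 1) / (n_frames - 1)


n = 11  # assumed frame count; yields steps of 0.1 as in the example
labels = [round(linear_soft_label(i, n), 2) for i in range(1, n + 1)]
print(labels)
# [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
```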

In another embodiment, the performing procedure further includes steps of:

determining whether the soft label of the selected sensing frame exceeds a confidence threshold before classifying the gesture event (S1031);

when the soft label of the selected sensing frame exceeds the confidence threshold, the soft label of the selected sensing frame is approved, and the gesture event is classified;

when the soft label of the selected sensing frame does not exceed the confidence threshold, selecting another one of the sensing frames (S1032), determining a soft label of the selected another one of the sensing frames according to the neural network (S1033), and determining again whether the soft label of said selected another sensing frame exceeds the confidence threshold (S1031).

The gesture recognition method classifies the gesture event only when the soft label of the selected sensing frame exceeds the confidence threshold. Therefore, the gesture recognition method does not need to classify each time a soft label is calculated, and computation for classifying the gesture event can be reduced. Moreover, the performance of the gesture recognition method can be further improved.
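The loop formed by steps S1031 to S1033 can be sketched as follows. This is an illustrative sketch only: `recognize`, `stub_soft_label`, and `stub_classify` are hypothetical names, the two stubs stand in for the neural network, and the frames are plain numbers so that the example runs without any sensor or trained model.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

Frame = TypeVar("Frame")


def recognize(
    frames: Sequence[Frame],
    soft_label_of: Callable[[Frame], float],
    classify: Callable[[Frame], str],
    threshold: float,
) -> Optional[Tuple[int, str]]:
    """Return (frame index, gesture) for the first approved frame, else None."""
    for index, frame in enumerate(frames):   # S102 / S1032: select a frame
        label = soft_label_of(frame)         # S103 / S1033: soft label
        if label > threshold:                # S1031: exceeds the threshold?
            return index, classify(frame)    # S104: classify once approved
    return None                              # no frame was approved


# Stand-ins for the neural network (assumptions for illustration only).
classify_calls: List[float] = []


def stub_soft_label(frame: float) -> float:
    return frame  # pretend each frame already carries its soft label


def stub_classify(frame: float) -> str:
    classify_calls.append(frame)
    return "swipe"


frames = [0.1, 0.4, 0.7, 0.9, 1.0]
result = recognize(frames, stub_soft_label, stub_classify, threshold=0.8)
print(result)               # (3, 'swipe')
print(len(classify_calls))  # 1
```

In this sketch the classifier runs once, at the fourth frame, rather than once per frame, and the fifth frame is never examined, which mirrors the reduced computation and dynamic time period described above.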

Further, the training procedure can be executed two times to receive two training signals. When the soft label equals the confidence threshold, the training frame having that soft label lies at the same percentage of the sequence of the training frames in one of the training signals as in the other one of the training signals.

For example, the confidence threshold is 0.8. When the amount of the training frames of a first training signal is 20, the 80% point of the sequence of the training frames of the first training signal is the 16th training frame of the first training signal, and the soft label of the 16th training frame of the first training signal is 0.8.

When the amount of the training frames of a second training signal is 10, the 80% point of the sequence of the training frames of the second training signal is the 8th training frame of the second training signal, and the soft label of the 8th training frame of the second training signal is also 0.8.
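The arithmetic of this example can be verified with a short sketch. It assumes that the soft label of the kth of n training frames is k/n, which is one labeling that reproduces the numbers above; the text does not state this formula, and under this assumption the first frame receives 1/n rather than 0. The function names are hypothetical, and exact fractions are used to avoid floating-point comparison issues.

```python
from fractions import Fraction
from typing import Optional


def proportional_soft_label(k: int, n_frames: int) -> Fraction:
    """Assumed labeling: the kth of n_frames frames gets the label k/n."""
    return Fraction(k, n_frames)


def first_frame_reaching(threshold: Fraction, n_frames: int) -> Optional[int]:
    """1-based index of the first frame whose label reaches the threshold."""
    for k in range(1, n_frames + 1):
        if proportional_soft_label(k, n_frames) >= threshold:
            return k
    return None


threshold = Fraction(8, 10)                 # confidence threshold of 0.8
k_first = first_frame_reaching(threshold, 20)
k_second = first_frame_reaching(threshold, 10)

print(k_first, k_second)                                 # 16 8
print(Fraction(k_first, 20) == Fraction(k_second, 10))   # True
```

Both frames sit at 80% of their respective sequences, so a single confidence threshold corresponds to the same relative progress through a gesture regardless of how many frames the gesture spans.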

Even though numerous characteristics and advantages of the present invention have been set forth in the foregoing description, together with details of the structure and function of the invention, the disclosure is illustrative only. Changes may be made in detail, especially in matters of shape, size, and arrangement of parts within the principles of the invention to the full extent indicated by the broad general meaning of the terms in which the appended claims are expressed. 

What is claimed is:
 1. A gesture recognition method, comprising a performing procedure; wherein the performing procedure comprises steps of: receiving a sensing signal from a sensing unit; wherein the sensing signal comprises a plurality of sensing frames; selecting one of the sensing frames; determining a soft label of the selected sensing frame according to a neural network; classifying a gesture event when the soft label of the selected sensing frame is approved.
 2. The gesture recognition method as claimed in claim 1, wherein the soft label of the sensing frame is encoded into one-hot vectors to train the neural network.
 3. The gesture recognition method as claimed in claim 1, wherein the performing procedure comprises steps of: determining whether the soft label of the selected sensing frame exceeds a confidence threshold before classifying the gesture event; when the soft label of the selected sensing frame exceeds the confidence threshold, the soft label of the selected sensing frame is approved, and the gesture event is classified.
 4. The gesture recognition method as claimed in claim 3, wherein the performing procedure further comprises steps of: when the soft label of the selected sensing frame does not exceed the confidence threshold, selecting another one of the sensing frames, determining a soft label of the selected another one of the sensing frames according to the neural network, and determining again whether the soft label of said selected another sensing frame exceeds the confidence threshold.
 5. The gesture recognition method as claimed in claim 1, further comprising a training procedure; wherein the training procedure comprises steps of: receiving a training signal; wherein the training signal comprises a plurality of training frames; determining an amount of the training frames; determining a function according to the amount of the training frames; calculating a soft label of each training frame according to the function; and training the neural network with the soft labels of the training frames as ground truth of the neural network.
 6. The gesture recognition method as claimed in claim 5, wherein the training procedure is executed two times for receiving two said training signals; wherein a percentage of a sequence of the training frames of one of the training signals equals a percentage of a sequence of the training frames of another one of the training signals having the same soft label when the soft label equals the confidence threshold.
 7. The gesture recognition method as claimed in claim 6, wherein: the function is a monotonic function; the training frames of the training signal are arranged in sequence; a first training frame of the training signal is mapped to the soft label of zero through the function; and a last training frame of the training signal is mapped to the soft label of one through the function.
 8. A gesture recognition system, comprising a performing device and a training device; wherein the performing device comprises: a sensing unit, sensing a sensing signal and a training signal; a memory unit, storing a neural network; a processing unit, electrically connected to the sensing unit and the memory unit; wherein the processing unit executes a performing procedure; wherein the performing procedure comprises steps of: receiving the sensing signal from the sensing unit; wherein the sensing signal comprises a plurality of sensing frames; selecting one of the sensing frames; determining a soft label of the selected sensing frame according to the neural network; classifying a gesture event when the soft label of the selected sensing frame is approved; wherein the training device is electrically connected to the performing device, and executes a training procedure; wherein the training procedure comprises steps of: receiving the training signal from the sensing unit of the performing device; wherein the training signal comprises a plurality of training frames; determining an amount of the training frames of the training signal; determining a function according to the amount of the training frames of the training signal; calculating a soft label of each training frame of the training signal according to the function; and training the neural network with the soft labels of the training frames of the training signal as ground truth of the neural network.
 9. The gesture recognition system as claimed in claim 8, wherein: the function is a monotonic function; the training frames of the training signal are arranged in sequence; a first training frame of the training signal is mapped to the soft label of zero through the function; and a last training frame of the training signal is mapped to the soft label of one through the function.
 10. The gesture recognition system as claimed in claim 8, wherein the soft labels of the sensing frames of the sensing signal are encoded into one-hot vectors to train the neural network.
 11. The gesture recognition system as claimed in claim 8, wherein the performing procedure further comprises steps of: determining whether the soft label of the selected sensing frame exceeds a confidence threshold before classifying the gesture event; when the soft label of the selected sensing frame exceeds the confidence threshold, the soft label of the selected sensing frame is approved, and the gesture event is classified.
 12. The gesture recognition system as claimed in claim 11, wherein when the soft label of the selected sensing frame of the sensing signal does not exceed the confidence threshold, the processing unit selects another one of the sensing frames, and further determines a soft label of the selected another one of the sensing frames according to the neural network, and the processing unit determines again whether the soft label of said selected another sensing frame exceeds the confidence threshold.
 13. The gesture recognition system as claimed in claim 11, wherein the training procedure is executed two times for receiving two said training signals; wherein a percentage of a sequence of the training frames of one of the training signals equals a percentage of a sequence of the training frames of another one of the training signals having the same soft label when the soft label equals the confidence threshold.
 14. A performing device, comprising: a sensing unit, sensing a sensing signal and a training signal; a memory unit, storing a neural network; a processing unit, electrically connected to the sensing unit and the memory unit; wherein the processing unit executes a performing procedure; wherein the performing procedure comprises steps of: receiving the sensing signal from the sensing unit; wherein the sensing signal comprises a plurality of sensing frames; selecting one of the sensing frames; determining a soft label of the selected sensing frame according to the neural network; classifying a gesture event when the soft label of the selected sensing frame is approved.
 15. The performing device as claimed in claim 14, wherein the soft labels of the sensing frames are encoded into one-hot vectors to train the neural network.
 16. The performing device as claimed in claim 14, wherein the performing procedure further comprises steps of: determining whether the soft label of the selected sensing frame exceeds a confidence threshold before classifying the gesture event; when the soft label of the selected sensing frame exceeds the confidence threshold, the soft label of the selected sensing frame is approved, and the gesture event is classified.
 17. The performing device as claimed in claim 16, wherein when the soft label of the selected sensing frame of the sensing signal does not exceed the confidence threshold, the processing unit selects another one of the sensing frames, classifies a gesture event of the selected sensing frame according to the neural network, and further determines a soft label of the selected another one of the sensing frames according to the neural network, and the processing unit determines again whether the soft label of said selected another sensing frame exceeds the confidence threshold. 